The First Workshop on Language Models for Low-Resource Languages

Important Dates:
NLP-LoResLM

Paper submission due	December 31, 2025
First Decision	March 31, 2026 - April 30, 2026
Revised Version Submission	May 1, 2026 - June 1, 2026
Final Decision	August 30, 2026

Important Dates
LoResLM at EACL 2026

Paper submission due	January 6 (Tue), 2026
Notification of acceptance	January 28 (Wed), 2026
Camera-ready due	February 3 (Thu), 2026
Workshop	March 28, 2026 - March 29, 2026 (TBD), co-located with EACL 2026
* These dates are approximate, based on the EACL 2026 schedule, and are subject to change.

News

October 27, 2025: We are organising the second iteration of the workshop at EACL 2026. Check out the Call for Papers.
September 24, 2025: We are organising a special issue in the journal Natural Language Processing. More details are here.
January 21, 2025: We successfully concluded LoResLM 2025. Thank you to everyone who contributed. The proceedings are available here.

Neural language models have revolutionised natural language processing (NLP) and have provided state-of-the-art results for many tasks. However, their effectiveness is largely dependent on the pre-training resources. Therefore, language models (LMs) often struggle with low-resource languages in both training and evaluation. Recently, there has been a growing trend in developing and adopting LMs for low-resource languages. We aim to provide a forum for researchers to share and discuss their ongoing work on LMs for low-resource languages.

Background

Globally, there are approximately 7,000 spoken languages (van Esch et al., 2022), yet most NLP research focuses on only about 20 high-resource languages (Magueresse et al., 2020). The many remaining languages, which receive little research attention, are commonly known as low-resource languages. Even though these languages represent significant global communities, they generally lack sufficient digital data and resources to support NLP tasks or benefit from recent advancements in the field (Ruder et al., 2022).

Neural language models, particularly transformers and large language models (LLMs), have revolutionised NLP, achieving state-of-the-art results in many tasks (Touvron et al., 2023; Minaee et al., 2024). However, because the capabilities of language models (LMs) are primarily determined by the characteristics of their pre-training corpora, disparities in language resources carry over into the models. Therefore, LMs often struggle with low-resource languages in training and evaluation despite their strong performance on high-resource languages (Blasi et al., 2022).

This bias towards high-resource languages in NLP negatively affects a significant portion of the global community. In response, there has been a growing trend in developing and adopting LMs for low-resource languages to promote linguistic fairness. To support and strengthen this movement, this workshop aims to provide a forum for researchers to share and discuss their ongoing work on LMs for low-resource languages. We mainly aim to encourage the development of LM-based approaches and to compile a research collection that supports ongoing and future research in this area, building on recent advancements in LMs.

References

Damian Blasi, Antonios Anastasopoulos, and Graham Neubig. 2022. Systematic inequalities in language technology performance across the world’s languages. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 5486–5505, Dublin, Ireland. Association for Computational Linguistics.

Alexandre Magueresse, Vincent Carles, and Evan Heetderks. 2020. Low-resource languages: A review of past work and future challenges. arXiv preprint arXiv:2006.07264.

Shervin Minaee, Tomas Mikolov, Narjes Nikzad, Meysam Chenaghlu, Richard Socher, Xavier Amatriain, and Jianfeng Gao. 2024. Large language models: A survey. arXiv preprint arXiv:2402.06196.

Sebastian Ruder, Ivan Vulić, and Anders Søgaard. 2022. Square one bias in NLP: Towards a multidimensional exploration of the research manifold. In Findings of the Association for Computational Linguistics: ACL 2022, pages 2340–2354, Dublin, Ireland. Association for Computational Linguistics.

Hugo Touvron, Louis Martin, Kevin Stone, Peter Albert, Amjad Almahairi, Yasmine Babaei, Nikolay Bashlykov, Soumya Batra, Prajjwal Bhargava, Shruti Bhosale, et al. 2023. Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288.

Daan van Esch, Tamar Lucassen, Sebastian Ruder, Isaac Caswell, and Clara Rivera. 2022. Writing system and speaker metadata for 2,800+ language varieties. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 5035–5046, Marseille, France. European Language Resources Association.

Contact us

Stay in touch to receive updates about LoResLM 2025

Language Models for Low-Resource Languages

Building Bridges through NLP Innovation: Empowering Linguistic Diversity by Crafting Language Models for Low-Resource Languages
